r one of the C terminal residues. The cleavage happens between
$R_1$ and $R_1'$. The requirement of a successful interaction between a protease
and a cleavage site in a polyprotein is the amino acid distribution trend
surrounding a cleavage site. Different proteases require different patterns
of the amino acid composition trend of a cleavage site. For instance, the
protease cleavage usually happens when $R_1 = R$, where R is the
amino acid arginine [Yang, et al., 2006]. The trypsin protease cleavage
happens when $R_1 = K$ or $R_1 = R$, where K is the amino acid
lysine [Thomson, et al., 2003].
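The trypsin rule above can be written as a short scan over a sequence. This is only a minimal sketch of the simplified rule ($R_1$ is K or R); the function names `cleavage_positions` and `digest` and the example sequence are illustrative assumptions, and the sketch ignores any exceptions implied by the word "usually".

```python
def cleavage_positions(sequence, r1_residues=("K", "R")):
    """Return 1-based positions i where residue i acts as R1.

    Under the simplified rule, the bond between residue i and
    residue i + 1 is cut. The last residue is skipped because
    there is no residue after it to cut from.
    """
    return [i + 1 for i, aa in enumerate(sequence[:-1]) if aa in r1_residues]


def digest(sequence, r1_residues=("K", "R")):
    """Split a sequence into fragments at every predicted cleavage."""
    fragments = []
    start = 0
    for pos in cleavage_positions(sequence, r1_residues):
        fragments.append(sequence[start:pos])
        start = pos
    fragments.append(sequence[start:])
    return fragments


# Hypothetical sequence, chosen only to illustrate the rule.
print(cleavage_positions("AKGRSLK"))  # [2, 4]
print(digest("AKGRSLK"))              # ['AK', 'GR', 'SLK']
```

Passing `r1_residues=("R",)` gives the arginine-only rule mentioned first in the text.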
The success of protease cleavage pattern discovery for a specific
protease therefore depends on the accurate recognition of the amino acid
distribution trend within a set of cleaved peptides. Only when this
pattern discriminates well between cleaved peptides and non-cleaved
peptides, as demonstrated in a machine learning model, is subsequent
successful inhibitor design against the protease possible
[Goodman, et al., 1983; Sheth, et al., 1984; Singh, et al., 2001].
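As a rough illustration of what "discrimination power" means at a single position, the sketch below compares how often a residue occurs at the $R_1$ position in cleaved and in non-cleaved peptides. The octapeptides are invented toy data, not taken from any study, and the names `residue_frequencies` and `enrichment` are hypothetical. It is assumed that the cleavage lies between the fourth and fifth residues, so $R_1$ is at index 3.

```python
from collections import Counter


def residue_frequencies(peptides, index):
    """Relative frequency of each amino acid at one peptide position."""
    counts = Counter(p[index] for p in peptides)
    total = len(peptides)
    return {aa: n / total for aa, n in counts.items()}


def enrichment(cleaved, non_cleaved, index, residue):
    """Frequency difference (cleaved minus non-cleaved) for one residue."""
    f_cleaved = residue_frequencies(cleaved, index).get(residue, 0.0)
    f_non = residue_frequencies(non_cleaved, index).get(residue, 0.0)
    return f_cleaved - f_non


# Invented toy octapeptides; R1 is assumed to sit at index 3.
cleaved = ["AGLKSTVA", "LLPRGAVD", "GASKLLNE", "TVDRSAGL"]
non_cleaved = ["AGLASTVA", "LLPEGAVD", "GASKPLNE", "TVDGSAGL"]

print(residue_frequencies(cleaved, 3))          # {'K': 0.5, 'R': 0.5}
print(enrichment(cleaved, non_cleaved, 3, "R"))  # 0.5
print(enrichment(cleaved, non_cleaved, 3, "K"))  # 0.25
```

A large positive difference suggests the residue is characteristic of cleaved peptides at that position. A real analysis would use many more peptides and a trained model rather than a single frequency gap.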
Linear discriminant analysis algorithm
When analysing a biological data set, it is often required to recognise the
classes of molecules by analysing the factors associated with them, for
example when predicting protein function, gene-gene interaction, enzyme
cleavage sites and functional genes. This process is called object
classification. When there are only two classes of objects in a data
set, the process is called discriminant analysis. The final goal
of a classification model is to label novel data (novel objects), or
novel peptides in this context.
The linear discriminant analysis algorithm (LDA) was developed by
Fisher as a linear model for classifying data and is also called the
Fisher discriminant algorithm [Fisher, 1936]. The earliest application of
LDA in PubMed was a perinatal mortality study [Greenberg, 1963].
Although LDA is a linear model, it lays down the foundation of
classification analysis and it still attracts significant attention in pattern